AMENDMENTS TO CLAIMS 

This listing of claims will replace all prior versions, and 
listings, of claims in the application: 

Listing of Claims:

1. (Currently Amended) A method for utterance 
verification implemented in a computer system having a 
feature vector extraction module, a speech recognition 
module, a speech segmentation module, a verification 
feature vector generation module, a verification score 
calculation module, a verification score combination 
module, and a decision module, the method comprising the 
steps of: 

(A) extracting a sequence of feature vectors from an
input speech by the feature vector extraction module;

(B) inputting the sequence of feature vectors to [[speech
recognizer]] the speech recognition module for obtaining at
least one candidate string;

(C) segmenting the input speech into at least one 
speech segment by the speech segmentation module according 
to the content of candidate string, which comprises 
individual recognition units, wherein each speech segment 
corresponds to a recognition unit and each recognition unit 
corresponds to a verification unit; 

(D) generating a sequence of verification feature
vectors for each speech segment by the verification feature
vector generation module according to the sequence of
feature vectors of the speech segment, wherein the 
verification feature vectors are generated by normalizing 
the feature vectors using the normalization parameters of 
the verification unit corresponding to the speech segment, 
the normalization parameters of the verification unit are 
the means and the standard deviations of the feature 
vectors corresponding to the verification unit in training
data, and these parameters are calculated in advance of
runtime;

(E) utilizing a verification-unit corresponded 
classifier for each speech segment to calculate the 
verification score by the verification score calculation 
module, where the sequence of verification feature vectors
of the speech segment is used as the input of the 
classifier, wherein said classifier is a neural network and 
the neural network is an MLP (multi-layer perceptron), 
wherein the MLP is used to calculate the verification score 
by inputting the verification feature vector and performing 
a feed-forward operation, and wherein the verification 
score of a speech segment is the mean of the verification 
scores of the sequence of verification feature vectors
corresponding to the speech segment;

(F) combining the verification scores of all speech 
segments for obtaining an utterance verification score of 
the candidate string by the verification score combination 
module; and

(G) comparing the utterance verification score of the 
candidate string with a predetermined threshold by the 
decision module so as to accept the candidate string if the 
utterance verification score is larger than the 
predetermined threshold. 
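
Steps (D) through (G) can be illustrated with a minimal Python sketch. It is illustrative only and forms no part of the claims. The function names, the single hidden layer, the sigmoid activation, and the dictionary layout holding each verification unit's parameters are all assumptions; the mean used to combine segment scores is the option recited in claim 8.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def normalize(frame, mean, std):
    """Step (D): normalize one feature vector with the unit's precomputed mean/std."""
    return [(x - m) / s for x, m, s in zip(frame, mean, std)]

def mlp_score(vec, unit):
    """Step (E): one feed-forward pass (assumed: one hidden layer, sigmoid units)."""
    hidden = [sigmoid(sum(w * x for w, x in zip(row, vec)) + b)
              for row, b in zip(unit["w_hidden"], unit["b_hidden"])]
    return sigmoid(sum(w * h for w, h in zip(unit["w_out"], hidden)) + unit["b_out"])

def segment_score(frames, unit):
    """Step (E): segment score is the mean of the per-vector scores."""
    scores = [mlp_score(normalize(f, unit["mean"], unit["std"]), unit) for f in frames]
    return sum(scores) / len(scores)

def verify(segments, units, threshold):
    """Steps (F)-(G): combine segment scores (mean, as in claim 8) and compare."""
    seg_scores = [segment_score(frames, units[name]) for name, frames in segments]
    utterance_score = sum(seg_scores) / len(seg_scores)
    return utterance_score, utterance_score > threshold
```

Here `segments` is assumed to be a list of `(unit_name, frames)` pairs produced by the segmentation step, and `units` maps each unit name to its normalization parameters and MLP weights.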

2. (Canceled)

3. (Canceled)

4. (Canceled)

5. (Original) The method as claimed in claim 3,
wherein the MLP is trained by using an error
back-propagation algorithm to reduce the mean square error
between the verification score output of the MLP and the
target value.

6. (Original) The method as claimed in claim 5, 
wherein the MLP corresponding to the verification unit is 
trained by inputting the sequences of verification feature
vectors of the speech segments corresponding to the
verification unit and the sequences of verification feature
vectors of the speech segments not corresponding to the 
verification unit. 

7. (Original) The method as claimed in claim 6, 
wherein the target value is 1 if the speech segment 
corresponds to the verification unit and which is 0 if the 
speech segment does not correspond to the verification 
unit.
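
The training recited in claims 5 through 7 can be sketched as follows. This is a hedged illustration, not the applicant's implementation: the function names, hidden-layer size, learning rate, epoch count, and use of per-sample gradient descent are assumptions. Only the targets (1 for vectors from matching segments, 0 otherwise) and the squared-error objective minimized by back-propagation come from the claims.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, params):
    """Feed-forward pass; returns hidden activations and the output score."""
    w1, b1, w2, b2 = params
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b) for row, b in zip(w1, b1)]
    y = sigmoid(sum(w * hi for w, hi in zip(w2, h)) + b2)
    return h, y

def train_mlp(positives, negatives, hidden=4, epochs=500, lr=0.5, seed=0):
    """Train one unit's MLP: target 1 for matching vectors, 0 for non-matching."""
    rng = random.Random(seed)
    data = [(v, 1.0) for v in positives] + [(v, 0.0) for v in negatives]
    dim = len(data[0][0])
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    b2 = 0.0
    for _ in range(epochs):
        rng.shuffle(data)
        for x, t in data:
            h, y = forward(x, (w1, b1, w2, b2))
            # Back-propagate the gradient of the squared error 0.5 * (y - t)^2.
            d_out = (y - t) * y * (1.0 - y)
            d_hid = [d_out * w2[j] * h[j] * (1.0 - h[j]) for j in range(hidden)]
            for j in range(hidden):
                w2[j] -= lr * d_out * h[j]
                b1[j] -= lr * d_hid[j]
                for i in range(dim):
                    w1[j][i] -= lr * d_hid[j] * x[i]
            b2 -= lr * d_out
    return w1, b1, w2, b2
```

In this sketch, `positives` and `negatives` are assumed to hold already-normalized verification feature vectors drawn from segments that do and do not correspond to the verification unit.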

8. (Original) The method as claimed in claim 1, 
wherein in step (F), the utterance verification score of 
the candidate string is the mean of the verification scores 
of the speech segments in the input speech. 

9. (Previously presented) The method as claimed in 
claim 1, wherein the input speech is corrupted by noise 
with different power levels of SNR (Signal to Noise Ratio).

10. (Previously presented) The method as claimed in 
claim 6, wherein the speech segments used for training are 
corrupted by noise with different power levels of SNR 
(Signal to Noise Ratio).
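
Claims 9 and 10 do not state how the noise is applied. One common way to corrupt speech at a chosen SNR is to scale the noise before adding it, sketched below under that assumption. The function names and the sample-list representation are hypothetical.

```python
import math

def power(samples):
    """Average power of a signal given as a list of samples."""
    return sum(v * v for v in samples) / len(samples)

def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech, scaled so the speech-to-noise power ratio is snr_db."""
    gain = math.sqrt(power(speech) / (power(noise) * 10 ** (snr_db / 10.0)))
    return [s + gain * n for s, n in zip(speech, noise)]
```

Calling `mix_at_snr` with several `snr_db` values on the same segment would yield training or test material at different noise levels.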

11. (Currently Amended) A system for utterance 
verification comprising: 

software modules implemented in a computer system, the 
software modules including: 

a feature vector extraction module for extracting a
sequence of feature vectors from an input speech; 

a speech recognition module for obtaining at least one 
candidate string by inputting the sequence of feature 
vectors;

a speech segmentation module for segmenting the input 
speech into at least one speech segment according to the 
content of candidate string, which comprises individual 
recognition units, wherein each speech segment corresponds 
to a recognition unit and each recognition unit corresponds 
to a verification unit; 

a verification feature vector generation module for 
generating a sequence of verification feature vectors for 
each speech segment according to the sequence of feature 
vectors of the speech segment, wherein the verification 
feature vectors are generated by normalizing the feature 
vectors using the normalization parameters of the 
verification unit corresponding to the speech segment, the 
normalization parameters of the verification unit are the 
means and the standard deviations of the feature vectors 
corresponding to the verification unit in training data, 
and these parameters are calculated in advance of runtime; 

a verification score calculation module for utilizing 
a verification-unit corresponded classifier for each speech 
segment to calculate the verification score, where the 
sequence of verification feature vectors of the speech 
segment is used as the input of the classifier, wherein the 
classifier is a neural network, and the neural network is 
an MLP (multi-layer perceptron), wherein the MLP is used to
calculate the verification score by inputting the 
verification feature vector and performing a feed-forward 
operation, and wherein the verification score of a speech 
segment is the mean of the verification scores of the 
sequence of verification feature vectors corresponding to 
the speech segment;

a verification score combination module for combining 
the verification scores of all speech segments for 
obtaining an utterance verification score of the candidate 
string; and 

a decision module for comparing the utterance 
verification score of the candidate string with a 
predetermined threshold so as to accept the candidate 
string if the utterance verification score is larger than 
the predetermined threshold. 

12. (Canceled) 

13. (Canceled) 

14. (Canceled) 

15. (Original) The system as claimed in claim 13, 
wherein the MLP is trained by using an error
back-propagation algorithm to reduce the mean square error
between the verification score output of the MLP and the 
target value. 

16. (Original) The system as claimed in claim 15, 
wherein the MLP corresponding to the verification unit is 
trained by inputting the sequences of verification feature 
vectors of the speech segments corresponding to the 
verification unit and the sequences of verification feature 
vectors of the speech segments not corresponding to the 
verification unit. 

17. (Original) The system as claimed in claim 16, 
wherein the target value is 1 if the speech segment 
corresponds to the verification unit and which is 0 if the 
speech segment does not correspond to the verification 
unit.

18. (Original) The system as claimed in claim 11,
wherein in the verification score combination module, the 
utterance verification score of the candidate string is the 
mean of the verification scores of the speech segments in 
the input speech. 

19. (Previously presented) The system as claimed in 
claim 11, wherein the input speech is corrupted by noise 
with different power levels of SNR (Signal to Noise Ratio).

20. (Previously presented) The system as claimed in 
claim 16, wherein the speech segments used for training are 
corrupted by noise with different power levels of SNR 
(Signal to Noise Ratio).






